Ranking and Empirical Minimization of U-Statistics
Authors
Abstract
The problem of ranking/ordering instances, instead of simply classifying them, has recently gained much attention in machine learning. In this paper we formulate the ranking problem in a rigorous statistical framework. The goal is to learn a ranking rule for deciding, between two instances, which one is “better,” with minimum ranking risk. Since the natural estimates of the risk take the form of a U-statistic, results from the theory of U-processes are required to investigate the consistency of empirical risk minimizers. We establish, in particular, a tail inequality for degenerate U-processes, and apply it to show that fast rates of convergence may be achieved under specific noise assumptions, just as in classification. Convex risk minimization methods are also studied.
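To fix ideas, here is a minimal sketch of the pairwise formulation (the notation below is chosen for illustration and is not copied verbatim from the paper). Given i.i.d. pairs $(X_1,Y_1),\dots,(X_n,Y_n)$ and an independent copy $(X',Y')$ of $(X,Y)$, a ranking rule $r(x,x')\in\{-1,+1\}$ declares whether $x$ is ranked above $x'$. The ranking risk is
\[
L(r) = \mathbb{P}\{(Y - Y')\, r(X, X') < 0\},
\]
and its natural empirical counterpart is the degree-two U-statistic
\[
L_n(r) = \frac{1}{n(n-1)} \sum_{i \neq j} \mathbf{1}\{(Y_i - Y_j)\, r(X_i, X_j) < 0\},
\]
which is precisely why uniform control of such estimates over a class of candidate rules calls for the theory of U-processes.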
Similar Papers
Ranking and Scoring Using Empirical Risk Minimization
A general model is proposed for studying ranking problems. We investigate learning methods based on empirical minimization of the natural estimates of the ranking risk. The empirical estimates are of the form of a U-statistic. Inequalities from the theory of U-statistics and U-processes are used to obtain performance bounds for the empirical risk minimizers. Convex risk minimization methods a...
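To make the pairwise nature of the estimate concrete, the sketch below (function and variable names are ours, not taken from the paper) computes the degree-two U-statistic estimate of the ranking risk by averaging the 0-1 ranking loss over all ordered pairs.

```python
import numpy as np

def empirical_ranking_risk(X, y, rank_rule):
    """U-statistic estimate of the ranking risk: the fraction of ordered
    pairs (i, j), i != j, whose predicted order disagrees with the sign
    of the label difference y[i] - y[j]."""
    n = len(y)
    errors = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if (y[i] - y[j]) * rank_rule(X[i], X[j]) < 0:
                errors += 1
    return errors / (n * (n - 1))

# Toy usage: a rule that ranks instances by their first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X[:, 0] + 0.1 * rng.normal(size=50)
rule = lambda a, b: 1.0 if a[0] > b[0] else -1.0
print(empirical_ranking_risk(X, y, rule))
```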
Scaling-up Empirical Risk Minimization: Optimization of Incomplete $U$-statistics
In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by U-statistics of degree d ≥ 1, i.e. functionals of the training data with low variance that take the form of averages over d-tuples. From a computational perspective, the calculation of such statistics is highly expensive even for a moderate sample siz...
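A minimal sketch of the incomplete-U-statistic idea described above, specializing to pairs (d = 2) for simplicity (names and toy data are illustrative): rather than averaging the pairwise loss over all n(n-1) ordered pairs, one averages over B pairs drawn at random, reducing the cost from quadratic in n to O(B) at the price of a small additional variance.

```python
import numpy as np

def incomplete_ranking_risk(X, y, rank_rule, B, rng):
    """Incomplete U-statistic: Monte Carlo estimate of the ranking risk
    built from B ordered pairs sampled uniformly at random."""
    n = len(y)
    total = 0.0
    for _ in range(B):
        i, j = rng.choice(n, size=2, replace=False)  # one random pair i != j
        total += float((y[i] - y[j]) * rank_rule(X[i], X[j]) < 0)
    return total / B

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 3))
y = X[:, 0] + 0.1 * rng.normal(size=2000)
rule = lambda a, b: 1.0 if a[0] > b[0] else -1.0
# 5000 sampled pairs instead of the ~4 * 10**6 pairs of the complete statistic.
print(incomplete_ranking_risk(X, y, rule, B=5000, rng=rng))
```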
Maximal Deviations of Incomplete U-statistics with Applications to Empirical Risk Sampling
It is the goal of this paper to extend the Empirical Risk Minimization (ERM) paradigm, from a practical perspective, to the situation where a natural estimate of the risk is of the form of a K-sample U-statistic, as is the case in the K-partite ranking problem for instance. Indeed, the numerical computation of the empirical risk is hardly feasible, if not infeasible, even for moderate sampl...
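For intuition on the multi-sample case, take K = 2 (bipartite ranking): the empirical AUC of a scoring function is a two-sample U-statistic, and the same incomplete-sampling trick applies. The sketch below uses our own names and toy data and is not taken from the paper.

```python
import numpy as np

def auc_complete(scores_pos, scores_neg):
    """Two-sample U-statistic: the fraction of (positive, negative) pairs
    on which the positive instance receives the strictly higher score."""
    return np.mean(scores_pos[:, None] > scores_neg[None, :])

def auc_incomplete(scores_pos, scores_neg, B, rng):
    """Incomplete version: average over B randomly sampled pairs only."""
    i = rng.integers(len(scores_pos), size=B)
    j = rng.integers(len(scores_neg), size=B)
    return np.mean(scores_pos[i] > scores_neg[j])

rng = np.random.default_rng(2)
scores_pos = rng.normal(loc=1.0, size=3000)
scores_neg = rng.normal(loc=0.0, size=3000)
print(auc_complete(scores_pos, scores_neg))
print(auc_incomplete(scores_pos, scores_neg, B=10_000, rng=rng))
```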
Ranking - convex risk minimization
The problem of ranking (rank regression) has become popular in the machine learning community. This theory relates to problems in which one has to predict (guess) the order between objects on the basis of vectors describing their observed features. In many ranking algorithms a convex loss function is used instead of the 0-1 loss, which makes these procedures computationally efficient. Hence, conv...
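Concretely (a sketch in standard notation, not copied from the paper), convexification replaces the 0-1 ranking loss by a convex cost $\varphi$ applied to a real-valued pairwise score $f(x, x')$:
\[
A(f) = \mathbb{E}\big[\varphi\big(-\operatorname{sgn}(Y - Y')\, f(X, X')\big)\big],
\]
with, for instance, the hinge loss $\varphi(u) = \max(0, 1 + u)$, the exponential loss $\varphi(u) = e^{u}$, or the logit loss $\varphi(u) = \log_2(1 + e^{u})$. The empirical version of $A(f)$ is again a U-statistic over pairs, so the U-process machinery above still applies while the optimization problem becomes convex.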
SGD Algorithms based on Incomplete U-statistics: Large-Scale Minimization of Empirical Risk
In many learning problems, ranging from clustering to ranking through metric learning, empirical estimates of the risk functional consist of an average over tuples (e.g., pairs or triplets) of observations, rather than over individual observations. In this paper, we focus on how to best implement a stochastic approximation approach to solve such risk minimization problems. We argue that in the ...
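A minimal sketch of that stochastic-approximation idea (the linear model, step size, and sampling scheme below are illustrative choices of ours): each SGD step draws a fresh small batch of pairs, so every gradient is an incomplete U-statistic estimate of the gradient of the pairwise surrogate risk.

```python
import numpy as np

def sgd_pairwise_logistic(X, y, n_steps=2000, batch_pairs=32, lr=0.1, seed=0):
    """SGD for a linear scoring model s(x) = w @ x under the pairwise
    logistic surrogate loss log(1 + exp(-sign(y_i - y_j) * (s(x_i) - s(x_j)))).
    Each step averages the gradient over `batch_pairs` randomly drawn pairs,
    i.e. an incomplete U-statistic estimate of the full pairwise gradient."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_steps):
        i = rng.integers(n, size=batch_pairs)
        j = rng.integers(n, size=batch_pairs)
        z = np.sign(y[i] - y[j])                   # desired pairwise order
        diff = X[i] - X[j]
        margin = np.clip(z * (diff @ w), -30, 30)  # clip to avoid overflow in exp
        coeff = -z / (1.0 + np.exp(margin))        # (d loss / d margin) * z
        grad = (coeff[:, None] * diff).mean(axis=0)
        w -= lr * grad
    return w

# Toy usage: labels driven by a linear score of the features.
rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 5))
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 0.0]) + 0.1 * rng.normal(size=1000)
print(sgd_pairwise_logistic(X, y))
```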